These tutorials are intended to provide guidance for more sophisticated applications of \poy that involve multiple steps and a combination of different commands. Each tutorial contains a \poy script that is followed by detailed commentaries explaining the rationale behind each step of the analysis. Although these analyses can be conducted interactively using the \emph{Interactive Console} or running separate sequential analyses using the \emph{Graphical User Interphace}, the most practical way to do this is to use \poy scripts (see \emph{ POY4 Quick Start} for more information on \poy scripts).

It is important to remember that the numerical values for most command arguments will differ substantially depending on type, complexity, and size of the data. Therefore, the values used here should not be taken as optimal parameters.

The tutorials use sample datasets that are provided with \poy installation but can also be downloaded from the \poy site at
\begin{center}
\texttt{http://research.amnh.org/scicomp/projects/poy.php}
\end{center}
The minimal required items to run the tutorial analyses are the \poy application and the sample datafiles. Running these analyses requires some familiarity with \poy interface and command structure that can be found in the preceding chapters.

\section{Combining  search strategies}{\label{tutorial4.1}}
The following script implements a strategy for a thorough search. This is accomplished by generating a large number of independent initial trees by random addition sequence and combining different search strategies that aim at thoroughly exploring local tree space and escape the effect of composite optima by effectively traversing the tree space. In addition, this script shows how to output the status of the search to a log file and calculate the duration of the search. 

\begin{verbatim}
(* search using all data *)
read("9.fas","31.ss", aminoacids:("41.aa"))
(* We select as root the taxon with name t1. If we wanted
the taxon Locusta_migratoria, we would write:
root:"Locusta_migratoria" *)
set(seed:1,log:"all_data_search.log",root:"t1")
report(timer:"search start")
transform(tcm:(1,2),gap_opening:1)
build(250)
swap(threshold:5.0)
select()
perturb(transform(static_approx),iterations:15,ratchet:(0.2,3))
select()
fuse(iterations:200,swap())
select()
report("all_trees",trees:(total),"constree",graphconsensus,
"diagnosis",diagnosis)
report(timer:"search end")
set(nolog)
exit()
\end{verbatim}

\begin{itemize}
\item \texttt{(* search using all data *)} This first line of the script is a comment. While comments are optional and do not affect the analyses, they provide are useful for housekeeping purposes.
\item \texttt{read("9.fas","31.ss", aminoacids:("41.aa"))}
This command imports all the nucleotide sequence datafiles (all files with the extension \texttt{.seq}), a morphological datafile \texttt{morph.ss} in Hennig86 format, and an aminoacid datafile \texttt{myosin.aa}.
\item \texttt{set(seed:1,log:"all\_data\_search.log",root:"t1")} The \poycommand{set} command specifies conditions prior to tree searching. The \poyargument{seed} is used to ensure that the subsequent randomization procedures (such as tree building and selecting) are reproducible. Specifying the log produces a file, \texttt{all\_data\_search.log} that provides an additional means to monitor the process of the search. The outgroup (\texttt{taxon1}) is designated by the \poyargument{root}, so that all the reported trees have the desired polarity. By default, the analysis is performed using direct optimization.
\item \texttt{report(timer:"search start")} In combination with \texttt{report(timer:\\"search end")}, this commands reports the amount of time that the execution of commands enclosed by \poyargument{timer} takes. In this case, it reports how long it takes for the entire search to finish. Using timer is useful for planning a complex search strategy for large datasets that can take a long time to complete: it is instructive, for example, to know how long a search would last with a single replicate (one starting tree) before starting a search with multiple replicates.
\item \texttt{transform(tcm:(1,2),gap\_opening:1)} This command sets the \\transformation cost matrix for molecular data to be used in calculating the cost of the tree. Note, that in addition to the substitution and indel costs, the \poycommand{transform} specifies the cost for gap opening.
\item \texttt{build(250)} This commands begins tree-building step of the search that generates 250 random-addition trees. A large number of independent starting point insures that a large portion of tree space have been examined.
\item \texttt{swap(threshold:5.0)} \poycommand{swap} specifies that each of the 250 trees is subjected to alternating SPR and TBR branch swapping routine (the default of \poy). In addition to the most optimal trees, all the suboptimal trees found within 5\% of the best cost are thoroughly evaluated. This step ensures that the local searches settled on the local optima.
\item \texttt{select()} Upon completion of branch swapping, this command retains only optimal and topologically unique trees; all other trees are discarded from memory. 
\item \texttt{perturb(transform(static\_approx),iterations:15,ratchet:\\(0.2,3))} This command subjects the resulting trees to 15 rounds of ratchet, re-weighting 20\% of characters by a factor of 2. During ratcheting, the dynamic homology characters are transformed into static homology characters, so that the fraction of nucleotides (rather than of sequence fragments) is being re-weighted. This step, that begins at multiple local maxima, is intended to further traverse the tree space in search of a global optimum.
\item \texttt{fuse(iterations:200,swap())} In this step, up to 200 swappings of subtrees identical in terminal composition but different in topology, are performed between pairs of best trees recovered in the previous step. This is another strategy for further exploration of tree space. Each resulting tree is further refined by local branch swapping under the default parameters of \poycommand{swap}.
\item \texttt{select()} Upon completion of branch swapping, this command retains only optimal and topologically unique trees; all other trees are discarded from memory.
\item \texttt{report("all\_trees",trees:(total),"constree",\\graphconsensus,"diagnosis",diagnosis)} This command produces a series of outputs of the results of the search. It includes a file containing best trees in parenthetical notation and their costs (\texttt{all\_trees}), a graphical representation (in PDF format) of the strict consensus (\texttt{constree}), and the diagnoses for all best trees (\texttt{diagnosis}).
\item \texttt{report(timer:"search end")} This command stops timing the duration of search, initiated by the command \texttt{report(timer:"search start")}.
\item \texttt{set(nolog)} This command stops reporting any output to the log file, \texttt{all\_data\_search.log}.
\item \texttt{exit()} This commands ends the \poy session.
\end{itemize}

\section{Searching under iterative pass}{\label{tutorial4.2}}
The following script implements a strategy for a thorough search under iterative pass optimization. The iterative pass optimization is a very time consuming procedure that makes it impractical to conduct under this kind of optimization (save for very small datasets that can be analyzed within reasonable time). The iterative pass, however, can be used for the most advanced stages of the analysis for the final refinement, when a potential global optimum has been reached through searches under other kinds of optimization (such as direct optimization). Therefore, this tutorial begins with importing an existing tree (rather than performing tree building from scratch) and subjecting it to local branch swapping under iterative pass.

\begin{verbatim}
(* search using all data under ip *)
read("9.fas","31.ss",aminoacids:("41.aa"))
read("inter_tree.tre")
transform(tcm:(1,2),gap_opening:1)
set(iterative:approximate:2)
swap(around)
select()
report("all_trees",trees:(total),"constree", graphconsensus,
"diagnosis",diagnosis)
transform ((all, static_approx))
report ("phastwinclad.ss", phastwinclad)
exit()
\end{verbatim}

\begin{itemize}
\item \texttt{(* search using all data under ip *)} This first line of the script is a comment. While comments are optional and do not affect the analyses, they provide are useful for housekeeping purposes.
\item \texttt{read("9.fas","31.ss",aminoacids:("41.aa"))} This command imports all the nucleotide sequence datafiles (all files with the extension \texttt{.seq}), a morphological datafile \texttt{morph.ss} in Hennig86 format, and an aminoacid datafile \texttt{myosin.aa}.
\item \texttt{read("inter\_tree.tre")} This command imports a tree file, \texttt{inter\_tree.tre}, that contains the most optimal tree from prior analyses. 
\item \texttt{transform(tcm:(1,2),gap\_opening:1)} This command sets the transformation cost matrix for molecular data to be used in calculating the cost of the tree. Note, that in addition to the substitution and indel costs, the \poycommand{transform} specifies the cost for gap opening.
\item \texttt{set(iterative:approximate:2)} This command sets the optimization procedure
    to iterative pass such that approximated three dimensional alignments generated using pairwise alignments will be considered.  The program will iterate either two times, or until no further tress cost improvements can be made.
\item \texttt{swap(around)} This commands specifies that the the imported tree is subjected to alternating SPR and TBR branch swapping routine (the default of \poy) following the trajectory of search that completely evaluates the neighborhood of the tree (by using \poyargument{around}).
\item \texttt{select()} Upon completion of branch swapping, this command retains only optimal and topologically unique trees; all other trees are discarded from memory.
\item \texttt{report("all\_trees",trees:(total),"constree",\\graphconsensus,"diagnosis",diagnosis)} This command produces a series of outputs of the results of the search. It includes a file containing best trees in parenthetical notation and their costs (\texttt{all\_trees}), a graphical representation (in PDF format) of the strict consensus (\texttt{constree}), and the diagnoses for all best trees (\texttt{diagnosis}).
\item \texttt{transform ((all, "static\_approx"))} This command transforming all data into static homology characters corresponding to their implied alignments is necessary before reporting the data in the Hennig86 format.
\item \texttt{report ("phastwinclad.ss", phastwinclad)}  This command produces a file in the Hennig86 format which can be imported into other programs, such as WinClada.
\item \texttt{exit()} This commands ends the \poy session.
\end{itemize}

\section{Bremer support}{\label{tutorial4.3}}

This tutorial builds on the previous tutorials to illustrate Bremer support 
calculation on trees constructed using dynamic homology characters
    
   \begin{verbatim}
(* Bremer support part 1: generating trees *)
read("18s.fas","28s.fas")
set(root:"Americhernus")
build(200)
swap(all,visited:"tmp.trees", timeout:3600)
select()
report("my.tree",trees)
exit()

(* Bremer support part 2: Bremer calculations *)
read("18s.fas","28s.fas","my.tree")
report("support_tree.pdf",graphsupports:bremer:"tmp.trees")
exit()
\end{verbatim}

\begin{itemize}
\item \texttt{(* Bremer support part1: generating trees *)} This first line of the script is a comment. While comments are optional and do not affect the analyses, they provide are useful for housekeeping purposes. 
\item \texttt{read("18s.fas","28s.fas")} This command imports the nucleotide sequence files \texttt{18s.fas, 28s.fas}.
\item \texttt{set(root:"Americhernus")} The \poycommand{set} command specifies conditions prior to tree searching. The outgroup (\texttt{Americhernus}) is designated by the \poyargument{root}, so that all the reported trees have the desired polarity.     
\item \texttt{build(200)} This commands initializes tree-building and generates 200 random-addition trees.      
\item \texttt{swap(all,visited:"tmp.trees", timeout:3600)} The \poycommand{swap} command specifies that each of the trees be subjected to an alternating SPR and TBR branch swapping routine (the default of \poy).  The \poyargument{all} argument turns offf all swap heuristics. The \poyargument{visited:"tmp.trees"} argument stores every visited tree in the file specified.  Although the visited tree file is compressed to accommodate the large number of trees it will accumulate, the argument \poyargument{timeout} can be used to limit the number of seconds allowed for swapping also limiting the size of the file.  Alternately  the  \poycommand{swap} command can be performed as a separate analysis and terminated at the users discretion to maximize the number of trees generated.
\item \texttt{select()} Upon completion of branch swapping, this command retains only optimal and topologically unique trees; all other trees are discarded from memory. 
\item \texttt{report("my.tree",trees)} This command will save the swapped tree, \\ \texttt{my.tree} to a file. 
\item \texttt{exit()} This commands ends the \poy session.

\item \texttt{(* Bremer support part 2: Bremer calculations *)}  A comment indicating the intent of the commands which follow.
\item \texttt{read("18s.fas","28s.fas","my.tree")} This command imports the nucleotide sequence files \texttt{18s.fas, 28s.fas} and the tree file, \texttt{my.tree} for which the support values will be generated.  It is important to only read the selected \texttt{"my.tree"} file rather than the expansive  \texttt{"tmp.trees"} file which will be used in bremer calculations.
\item \texttt{report("support\_tree.pdf",graphsupports:bremer:"tmp.trees")} \\The \poycommand{report} command in combination with a file name and the \\ \poyargument{graphsupports} generates a pdf file designated by the name \texttt{support\_tree.pdf} with bremer values for the selected trees held in \texttt{tmp.trees}.  It is strongly recommended that this more exhaustive approach is used for calculating Bremer supports rather than simply using the \\ \poyargument{graphsupports} default s.  
\item \texttt{exit()} This commands ends the \poy session.
\end{itemize}

\section{Jackknife support}{\label{tutorial4.4}}

This tutorial illustrates calculating Jackknife support values for trees constructed with static homology characters.  Although it is possible to calculate both Jackknife and Bootstrap support values for trees constructed using dynamic homology characters, it is not recommended because resampling of dynamic characters occurs at the fragment (rather than nucleotide) level. Alternately dynamic homology characters can be converted to static characters using the transform argument \poyargument{static\_approx}, as it is common for biologists to want to  compute support values by resampling the characters from a fixed alignment.  In jackknife support a specified number of pseudo-repicates are performed independently such that in each one a percentage of characters is selected at random, without replacement.  The frequency of clade occurrence is its jackknife value.  

\begin{verbatim}
(* Jackknife support using static nucleotide characters *)
read ("28s.fas")
search (max_time:0:0:5)
select ()
report ("tree_for_supports.tre", trees)
transform(static_approx)
calculate_support (jackknife:(remove:0.50, resample:1000))
report (supports:jackknife)
report ("jacktree", graphsupports:jackknife)
exit()
\end{verbatim}
\begin{itemize}
\item \texttt{(* Jackknife support using static nucleotide characters *)} This first line of the script is a comment. While comments are optional and do not affect the analyses, they provide are useful for housekeeping purposes.
\item \texttt{read("28s.fas")} This command imports nucleotide sequence file \texttt{28s.fas}.
\item \texttt{search(max\_time:0:0:5)} This command performs a search (i.e. build, swap, perturb, fuse) of the data \texttt{28s.fas} for a maximum of 5 min (note that the short search time was selected for demonstration purposes to expedite the tutorial and not as a general time recommendation for actual data analyses).   
\item \texttt{select()} This command retains only optimal and topologically unique trees; all other trees are discarded from memory. 
\item \texttt{report("tree\_for\_supports.tre", trees)}  This command outputs a parenthetical representation of the tree \texttt{"tree\_for\_supports.tre"}.
\item \texttt{transform(static\_approx)} This command transforms the data (i.e. build, swap, perturb, fuse) of the data \texttt{28s.fas} for a maximum of 2 hours.
\item \texttt{calculate\_support(jackknife,(remove:0.50,resample:1000)} The \poycommand{calculate\_support} command generates support values as specified by the \poyargument{jackknife} argument for each tree held in memory. During each pseudoreplicate half of the characters will be deleted as specified in the argument\poyargument{remove:0.50}. 
\item \texttt{report(supports:jackknife)}  This command outputs a parenthetical representation of a tree with the support values previously calculated with the \poycommand{calculate\_support} command. 
\item \texttt{report("jacktree",graphsupports:jackknife)}  The \poycommand{report} command in combination with a file name and the \poyargument{graphsupports} generates a pdf file with jackknife values designated by the name specified (\emph{i.e.} \texttt{jacktree}). 
\item \texttt{exit()} This commands ends the \poy session.
\end{itemize}

\section{Sensitivity analysis}{\label{tutorial4.5}}

This tutorial demonstrates how data for parameter sensitivity analysis is generated. Sensitivity analysis \cite{wheeler1995} is a method of exploring the effect of relative costs of substitutions (transitions and transversions) and indels (insertions and deletions), either with or without taking gap extension cost into account. The approach consists of multiple iterations of the same search strategy under different parameters, (\emph{i.e} combinations of substitution and indel costs. Obviously, such analysis might become time consuming and certain methods are shown here how to achieve the results in reasonable time. This tutorial also shows the utility of the command \poycommand{store} and how transformation cost matrixes are imported and used.

\poy does not comprehensively display the results of the sensitivity analysis or implements the methods to select a parameter set that produces the optimal cladogram, but the output of a \poy analysis (such as the one presented here) generates all the necessary data for these additional steps.

For the sake of simplicity, this script contains commands for generating the data under just two parameter  sets. Using a larger number of parameter sets can easily be achieved by replicating the repeated parts of the script and substituting the names of input cost matrixes.

\begin{verbatim}
(* sensitivity analysis *)
read("9.fas")
set(root:"t1")
store("original_data")
transform(tcm:"111.txt")
build(100)
swap(timeout:3600)
select()
report("111.tre",trees:(total) ,"111con.tre",consensus,
"111con.pdf",graphconsensus)
use("original_data")
transform(tcm:"112.txt")
build(100)
swap(timeout:3600)
select()
report("112.tre",trees:(total),"112con.tre",consensus,
"112con.pdf",graphconsensus)
exit()
\end{verbatim}

\begin{itemize}
\item \texttt{(* sensitivity analysis *)} This first line of the script is a comment. While comments are optional and do not affect the analyses, they provide are useful for housekeeping purposes.
\item \texttt{read("9.fas")} This command imports all dynamic homology nucleotide data.
\item \texttt{set(root:"t1")} The outgroup (\texttt{taxon1}) is designated by the \poyargument{root}, so that all the reported trees have the desired polarity.
\item \texttt{store("original\_data")} This commands stores the current state of analysis in memory in a temporary file, \texttt{original\_data}.
\item \texttt{transform(tcm:"111.txt")} This command applies a transformation cost matrix from the file \texttt{111.txt} to for subsequent tree searching.
\item \texttt{build(100)} This commands begins tree-building step of the search that generates 250 random-addition trees. A large number of independent starting point insures that thee large portion of tree space have been examined.
\item \texttt{swap(timeout:3600)} \poycommand{swap} specifies that each of the 100 trees build in the previous step is subjected to alternating SPR and TBR branch swapping routine (the default of \poy). The argument \poyargument{timeout} specifies that 3600 seconds are allocated for swapping and the search is going to be stopped after reaching this limit. Because sensitivity analysis consists of multiple independent searches, it can take a tremendous amount of time to complete each one of them. In this example, \poyargument{timeout} is used to prevent the searches from running too long. Using \poyargument{timeout} is optional and can obviously produce suboptimal results due to insufficient time allocated to searching. A reasonable timeout value can be experimentally obtained by the analysis under one cost regime and monitoring time it takes to complete the search (using the argument \poyargument{timer} of the command \poycommand{set}). The advantage of using \poyargument{timeout} is saving time in cases where a local optimum is quickly reached and the search is trapped in its neighborhood.
\item \texttt{select()} Upon completion of branch swapping, this command retains only optimal and topologically unique trees; all other trees are discarded from memory.
\item \texttt{report("111.tre",trees:(total) ,"111con.tre",consensus,\\"111con.pdf", graphconsensus)} This command produces a file containing best tree(s) in parenthetical notation and their costs (\texttt{111.tre}), a a file containing the strict consensus in parenthetical notation \\(\texttt{111con.tre}), and a graphical representation (in PDF format) of the strict consensus (\texttt{111con.pdf}).
\item \texttt{use("original\_data")} This command restored the original (non-trans\-formed) data from the temporary file \texttt{original\_data} generated by \poycommand{store}.
\item \texttt{transform(tcm:"112.txt")} This command applies a different transformation cost matrix from the file \texttt{112.txt} to for another round of tree searching under this new cost regime.
\item \texttt{build(100)} This commands begins tree-building step of the search that generates 100 random-addition trees. A large number of independent starting point insures that thee large portion of tree space have been examined.
\item \texttt{swap(timeout:3600)} \poycommand{swap} specifies that each of the 100 trees build in the previous step is subjected to alternating SPR and TBR branch swapping routine (the default of \poy) to be interrupted after 3600 seconds (see the description in the previous iteration of the command above).
\item \texttt{select()} Upon completion of branch swapping, this command retains only optimal and topologically unique trees; all other trees are discarded from memory.
\item \texttt{report("112.tre",trees:(total),"112con.tre",consensus,\\"112con.pdf", graphconsensus)} This command produces a set of the same kinds of outputs as generated during the first search (see above) but under a new cost regime.
\item \texttt{exit()} This commands ends the \poy session.
\end{itemize}

\section{Chromosome analysis: unannotated sequences}{\label{tutorial4.6}}

This tutorial illustrates the analysis of chromosome-level transformations using 
unannotated sequences, i.e., contiguous strings of sequences without prior 
identification of independent regions. Prior to attempting an analysis of  
unannotated chromosomes it is necessary to enable the \texttt {"long sequences"}
option when compiling the \texttt{POY4} program. 

\begin{verbatim}
(* Chromosome analysis of unannotated sequences *)
read(chromosome:("ua15.fas"))
transform((all,dynamic_pam:(locus_breakpoint:20,locus_indel:
(10,1.5),circular:true,median:2, min_seed_length:15, 
min_rearrangement_len:45, min_loci_len:50,median:2,swap_med:1)))
build()
swap()
select()
report("chrom",diagnosis)
report("consensustree",graphconsensus)
exit()
\end{verbatim}

\begin{itemize}
\item \texttt{(* Chromosome analysis of unannotated sequences *)} This first line of the script is a comment. While comments are optional and do not affect the analyses, they provide are useful for housekeeping purposes.
\item \texttt{read(chromosome:("ua15.fas"))} This command imports the unannotated chromosomal sequence file \texttt{ua15.fas}. The argument \poyargument{chromosome} specifies the characters as unannotated chromosomes.
\item \texttt{transform((all,dynamic\_pam:(locus\_breakpoint:20,locus\_indel:\\(10,1.5),circular:true,seed\_length:15,rearranged\_len:50,sig\_block\_len:50,median:2,swap\_med:1)))}  The \poycommand{trans\-form} followed by the argument \poyargument{dynamic\_pam} specifies the conditions to be applied when calculating chromosome-level HTUs (medians).  The argument \poyargument{locus\_breakpoint:20} applies a breakpoint distance between chromosome loci with the integer value determining the rearrangement cost. The argument \poyargument{locus\_indel:10,1.5} specifies the indel costs for the chromosomal segments, whereby the integer 10 sets the gap opening cost and the float 1.5 sets the gap extension cost.  As the type of chromosomal sequences being analyzed are of mitochondrial origin, the argument \poyargument{circular:true} treats each chromosome sequence as a continuous rather than linear. The argument \poyargument{min\_seed\_length:15} sets the minimum length of identical continuous fragments (seeds) at 15.  As seeds are the foundation for larger homologous blocks setting the seed length to an integer appropriate for the data is critical to optimizing the efficiency with which the program correctly identifies chromosomal fragments and detects rearrangements.  The \poyargument{min\_rearrangement\_len} argument sets the lower limit for number of nucleotides between two seeds such that each is considered independent of the other.  Independent seeds belong to separate homologous blocks such that rearrangement events between blocks can be detected.  The argument \poyargument{min\_loci\_len} provides the integer value determining the minimum number of nucleotides constituting an homologous block.  In this example, because the data are mitochondrial containing relatively short homologous tRNA sequences, both the \poyargument{min\_rearrangement\_len} and the \poyargument{min\_loci\_len} were set to values below the defaults for these arguments.  The \poyargument{median} specifies the number of best cost locus-rearrangements which will be considered for each HTU (median), while the \poyargument{swap\_med} argument specifies the maximum number of swapping iterations performed in searching for the best pairwise alignment between two chromosomes.  Because values for the \poyargument{median} and \poyargument{swap\_med} arguments set above the default (1) will significantly increase the calculation time, the default values are recommended for larger chromosomal data sets.
\item \texttt{build()} This commands begins the tree-building step of the search that generates by default 10 random-addition trees. It is highly recommended that a greater number of Wagner builds be implemented when analyzing data for purposes other than this demonstration.
\item \texttt{swap()} The \poycommand{swap} command specifies that each of the trees be subjected to an alternating SPR and TBR branch swapping routine (the default of \poy).
\item \texttt{select()} Upon completion of branch swapping, this command retains only optimal and topologically unique trees; all other trees are discarded from memory. 
\item \texttt{report("chrom",diagnosis)}  The \poycommand{report} command in combination with a file name and the \poyargument{diagnosis} outputs the optimal median states and edge values to a specified file (\texttt{chrom}). 
\item \texttt{report("consensustree",graphconsensus)}  The \poycommand{report} command in combination with a file name and the \poyargument{graphconsensus} generates a pdf strict consensus file of the trees generated (\texttt{consensustree}). 
\item \texttt{exit()} This commands ends the \poy session.
\end{itemize}

\section{Chromosome analysis: annotated sequences}{\label{tutorial4.7}}

This tutorial illustrates the analysis of chromosome-level transformations using 
annotated sequences, i.e., contiguous strings of sequences with prior 
identification of independent regions delineated by pipes  \texttt{"|"}. 

\begin{verbatim}
(* Chromosome analysis of annotated sequences *)
read(annotated:("aninv2"))
transform((all,dynamic_pam:(locus_inversion:20,locus_indel:(10,
1.5),circular:false,median:1,swap_med:1)))
build()
swap()
select()
report("Annotated",diagnosis)
report("consensustree",graphconsensus)
exit()
\end{verbatim}

\begin{itemize}
\item \texttt{(* Chromosome analysis of annotated sequences  *)} This first line of the script is a comment. While comments are optional and do not affect the analyses, they provide are useful for housekeeping purposes.
\item \texttt{read(annotated:("aninv2"))} This command imports the annotated chromosomal sequence file \texttt{aninv2}. The argument \poyargument{annotated} specifies the characters. 
\item \texttt{transform((all,dynamic\_pam:(inversion:20,locus\_indel:\\(10,1.5),median:1,swap\_med:1)))}  The \poycommand{transform} follow\-ed by the argument \poyargument{dynamic\_pam} specifies the conditions to be applied when calculating chromosome-level HTUs (medians).  The argument \poyargument{locus\_inversion:20} applies an inversion distance between chromosome loci with the integer value determining the rearrangement cost. The argument \poyargument{locus\_indel:10,1.5} specifies the indel costs for the chromosomal segments, whereby the integer 10 sets the gap opening cost and the float 1.5 sets the gap extension cost.  The default values are applied to the arguments \poyargument{circular}  \poyargument{median} and \poyargument{swap\_med} arguments to minimize the time require for these nested search options.   To more exhaustively perform these calculations trees generated from initial builds can be imported to the program and reevaluated with values greater than 1 designated for the \poyargument{median} and \poyargument{swap\_med} arguments.
\item \texttt{build()} This commands begins the tree-building step of the search that generates by default 10 random-addition trees.  It is highly recommended that a greater number of Wagner builds be implemented when analyzing data for purposes other than this demonstration.
\item \texttt{swap()} The \poycommand{swap} command specifies that each of the trees be subjected to an alternating SPR and TBR branch swapping routine (the default of \poy).
\item \texttt{select()} Upon completion of branch swapping, this command retains only optimal and topologically unique trees; all other trees are discarded from memory. 
\item \texttt{report("Annotated",diagnosis)}  The \poycommand{report} command in combination with a file name and the \poyargument{diagnosis} outputs the optimal median states and edge values to a specified file (\texttt{Annotated}). 
\item \texttt{exit()} This commands ends the \poy session.
\end{itemize}

\section{Custom alphabet break inversion characters}{\label{tutorial4.8}}

This tutorial illustrates the analysis of the break inversion character type.  Break inversion characters are generated by transforming user-defined \poyargument {custom\_alphabet} characters.  
For example, observations of developmental stages could be represented in a corresponding array such that for each terminal taxon there is a sequence of observed developmental stages which are represented by a user-defined alphabet.  To allow rearrangement as well as indel events to be considered among alphabet elements, requires either reading in the data with the \poyargument {breakinv} argument or transforming the \poyargument {custom\_alphabet} sequences read to \poyargument {breakinv} characters. 

\begin{verbatim}
(* Custom Alphabet to Breakinv characters *)
read(custom_alphabet:("ca1.fas","m1.fas"))
transform((all,custom_to_breakinv:()))
transform((all,dynamic_pam:(locus_breakpoint:20,
locus_indel:(10,1.5),median:1,swap_med:1)))
build()
swap()
select()
report("breakinv",diagnosis)
report("consensustree",graphconsensus)
exit()
\end{verbatim}

\begin{itemize}
\item \texttt{(* Custom Alphabet to Breakinv characters  *)} This first line of the script is a comment. While comments are optional and do not affect the analyses, they provide are useful for housekeeping purposes.
\item \texttt{read(custom\_alphabet:("ca1.fas","m1.fas"))} This command imports the user-defined \poyargument {custom\_alphabet} character file \texttt{ca1.fas} and the accompanying transformation matrix \texttt{m1.fas}.
\item \texttt{transform((all,custom\_to\_breakinv:()))} This command transforms \poyargument {custom\_alphabet} characters to \poyargument {breakinv} characters which allow for rearrangement operations.
\item \texttt{transform((all,dynamic\_pam:(locus\_breakpoint:20,locus\_\\indel:(10,1.5),median:1,swap\_med:1)))}  The \poycommand{transform} follow\-ed by the argument \poyargument{dynamic\_pam} specifies the conditions to be applied when calculating medians. The argument \poyargument{locus\_breakpoint:20} applies a breakpoint distance calculation where the integer value specifies the rearrangement cost of \poyargument {breakinv} elements. The argument \poyargument{locus\_indel:10,1.5} specifies the indel costs for each \poyargument {breakinv} element, whereby the integer 10 sets the gap opening cost and the float 1.5 sets the gap extension cost.  The default values are applied to the \poyargument{median} and \poyargument{swap\_med} arguments to minimize the time require for these nested search options.   To more exhaustively perform these calculations trees generated from initial builds can be imported to the program and reevaluated with values greater than 1 designated for the \poyargument{median} and \poyargument{swap\_med} arguments
\item \texttt{build()} This commands begins the tree-building step of the search that generates by default 10 random-addition trees.  It is highly recommended that a greater number of Wagner builds be implemented when analyzing data for purposes other than this demonstration.
\item \texttt{swap()} The \poycommand{swap} command specifies that each of the trees be subjected to an alternating SPR and TBR branch swapping routine (the default of \poy).
\item \texttt{select()} Upon completion of branch swapping, this command retains only optimal and topologically unique trees; all other trees are discarded from memory. 
\item \texttt{report ("breakinv",diagnosis)}  The \poycommand{report} command in combination with a file name and the \poyargument{diagnosis} outputs the optimal median states and edge values to a specified file (\texttt{breakinv}). 
\item \texttt{exit()} This commands ends the \poy session.
\end{itemize}

\section{Genome analysis: multiple chromosomes}{\label{tutorial4.9}}

This tutorial illustrates the analysis of genome-level transformations using data from multiple chromosomes. 
Prior to attempting an analysis of unannotated chromosomes it is necessary to enable the \texttt {"long sequences"}
option when compiling the \texttt{POY4} program. 

\begin{verbatim}
(* Genome analysis of multiple chromosomes *)
read (genome:("gen5bp"))
transform((all,dynamic_pam:(chrom_breakpoint:80, chrom_indel:
(15,2.5),locus_inversion:20,locus_indel:(10,1.5), median:1,
swap_med:1)))
build()
swap()
select()
report("genome",diagnosis)
report("genconsensus",graphconsensus)
exit()
\end{verbatim}

\begin{itemize}
\item \texttt{(* Genome analysis of multiple chromosomes*)} This first line of the script is a comment. While comments are optional and do not affect the analyses, they provide are useful for housekeeping purposes.
\item \texttt{read(genome:("gen5bp"))} This command imports the genomic sequence file \texttt{mit5.txt}. The argument \poyargument{genome} specifies the characters as data consisting of multiple chromomsomes.
\item \texttt{transform((all,dynamic\_pam:(chrom\_breakpoint:80,chrom\_indel:(15, 2.5),locus\_breakpoint:20,locus\_indel:(10,1.5),median:1,swap\_med:1)))}  The \poycommand{transform} followed by the argument \poyargument{dynamic\_pam} specifies the conditions to be applied when calculating genome-level HTUs (medians). The argument \poyargument{chrom\_breakpoint:80} applies a breakpoint distance between chromosomes with the integer value determining the rearrangement cost. The argument \poyargument{chrom\_indel:15,1.5} specifies the indel costs for each entire chromosome, whereby the integer sets the gap opening cost and the float sets the gap extension cost.  The argument \poyargument{locus\_inversion:20} applies an inversion distance between loci with the integer value determining the rearrangement cost. The argument \poyargument{locus\_indel:10,1.5} specifies the indel costs for the chromosomal segments, whereby the integer 10 sets the gap opening cost and the float 1.5 sets the gap extension cost.  The default values are applied to the \poyargument{median} and \poyargument{swap\_med} arguments to minimize the time require for these nested search options.   To more exhaustively perform these calculations trees generated from initial builds can be imported to the program and reevaluated with values greater than 1 designated for the \poyargument{median} and \poyargument{swap\_med} arguments
\item \texttt{build()} This commands begins the tree-building step of the search that generates by default 10 random-addition trees.  It is highly recommended that a greater number of Wagner builds be implemented when analyzing data for purposes other than this demonstration.
\item \texttt{swap()} The \poycommand{swap} command specifies that each of the trees be subjected to an alternating SPR and TBR branch swapping routine (the default of \poy).
\item \texttt{select()} Upon completion of branch swapping, this command retains only optimal and topologically unique trees; all other trees are discarded from memory. 
\item \texttt{report("genome",diagnosis)}  The \poycommand{report} command in combination with a file name and the \poyargument{diagnosis} outputs the optimal median states and edge values to a specified file (\texttt{genome}). 
\item \texttt{report("genconsens",graphconsensus)}  The \poycommand{report} command in combination with a file name and the \poyargument{graphconsensus} generates a pdf strict consensus file of the trees generated (\texttt{genconsensus}). 
\item \texttt{exit()} This commands ends the \poy session.
\end{itemize}
